1 Poster session 2: Orthogonal code generator for 3G wireless transceivers
Boris D. Andreev, Edward L. Titlebaum, Eby G. Friedman 

April 2003 Proceedings of the 13th ACM Great Lakes symposium on VLSI GLSVLSI '03 

Publisher: ACM Press 

Full text available: pdf(152.16 KB) Additional Information: full citation, abstract, references, index terms

Orthogonal variable spreading factor (OVSF) codes are standard in third generation UMTS
cellular systems. The efficient generation of these codes is essential for reducing the area
and power of wireless transceivers. In this paper, the basic properties of this family of
codes are analyzed from an RTL perspective and two efficient hardware code generators
are proposed. Tradeoffs and design solutions as well as low power considerations are
discussed. These results represent the first reported impl ...

Keywords: 3GPP, CDMA, OVSF codes, UMTS, VLSI, WCDMA

2 W1-C: general symposium: New non-blocking EOVSF codes for multi-rate WCDMA
system

Yih-Fuh Wang, Hsing-Hu Chen, Tun-Ying Lin

July 2006 Proceeding of the 2006 international conference on Communications and 
mobile computing IWCMC '06 

Publisher: ACM Press 

Full text available: pdf(886.94 KB) Additional Information: full citation, abstract, references, index terms

Orthogonal variable spreading factor (OVSF) codes are employed in the third generation
(3G) wideband code division multiple access (WCDMA) wireless system as channelization
codes. Any two OVSF codes of different levels are orthogonal if and only if neither code
is an ancestor or descendant of the other. Therefore, when an OVSF code is
assigned to a user, it blocks all of its ancestor and descendant codes. This results in a
major drawback of OVSF codes, called blocking property: When an OVSF c ...

Keywords: EOVSF codes, OVSF codes, WCDMA, third generation (3G)
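
As an illustration of the ancestor/descendant rule in the abstract above, the following minimal Python sketch builds the OVSF code tree with the usual doubling construction (each code c has the children (c, c) and (c, -c)) and checks orthogonality between codes of different levels. The function names are hypothetical and are not taken from the cited paper; the orthogonality test assumes chip-aligned codes.

```python
def ovsf_children(code):
    """Return the two children of an OVSF code c: (c, c) and (c, -c)."""
    return code + code, code + [-x for x in code]


def ovsf_level(sf):
    """Generate all OVSF codes of spreading factor sf (a power of two)."""
    if sf < 1 or sf & (sf - 1):
        raise ValueError("spreading factor must be a power of two")
    codes = [[1]]
    while len(codes[0]) < sf:
        codes = [child for c in codes for child in ovsf_children(c)]
    return codes


def orthogonal(code_a, code_b):
    """Check orthogonality of two codes of possibly different lengths.

    The longer code is cut into segments as long as the shorter code.
    The pair counts as orthogonal only if every segment has zero
    correlation with the shorter code.
    """
    short, long_ = sorted((code_a, code_b), key=len)
    n = len(short)
    return all(
        sum(a * b for a, b in zip(short, long_[i:i + n])) == 0
        for i in range(0, len(long_), n)
    )


if __name__ == "__main__":
    parent = [1, 1]
    for code in ovsf_level(4):
        print(code, "orthogonal to", parent, ":", orthogonal(parent, code))
```

In this sketch the two descendants of [1, 1] fail the test while the two codes in the other subtree pass it, which is the blocking behaviour the abstract describes.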



3 Ad hoc networks: OVSF-CDMA code assignment in wireless ad hoc networks
Peng-Jun Wan, Xiang-Yang Li, Ophir Frieder

October 2004 Proceedings of the 2004 joint workshop on Foundations of mobile
computing DIALM-POMC '04
Publisher: ACM Press

Full text available: pdf(98.79 KB) Additional Information: full citation, abstract, references, citings, index terms

Orthogonal Variable Spreading Factor (OVSF) CDMA code provides a means of support of 
variable rate data service at low hardware cost. In contrast to the conventional orthogonal 
fixed-spreading-factor CDMA code, OVSF-CDMA code consists of an infinite number of
codewords with variable rates but not every pair of codewords are orthogonal to each 
other. In an OVSF-CDMA wireless ad hoc network, a code assignment has to be conflict- 
free, i.e., two nodes can be assigned the same codeword or two non-ort ... 

Keywords: OVSF-CDMA, approximation algorithms, code assignment, graph theory,
system design 

4 Region division assignment: a new OVSF code reservation and assignment scheme
for downlink capacity in W-CDMA systems

Rujipun Assarut, Ken'ichi Kawanishi, Ushio Yamamoto, Yoshikuni Onozato
May 2006 Wireless Networks, volume 12 issue 3
Publisher: Kluwer Academic Publishers

Full text available: pdf(3.18 MB) Additional Information: full citation, abstract, references, index terms

This work focuses on the efficient management of orthogonal-variable-spreading-factor 
(OVSF) codes for multimedia communications in the W-CDMA systems. Because these 
systems must assign only OVSF codes that are mutually orthogonal, even if they have 
sufficient transmission capacity they block connections for which no orthogonal OVSF 
codes are available. This code blocking can, with extra overhead, be eliminated by 
reassigning codes, but in this paper we propose an OVSF code management scheme 
des ... 

Keywords: OVSF, W-CDMA, code assignment, code blocking 



5 Closed-loop architecture and protocols for rapid dynamic spreading gain adaptation
in CDMA networks 

Lih-feng Tsaur, Daniel C. Lee 

August 2006 IEEE/ACM Transactions on Networking (TON), volume 14 issue 4
Publisher: IEEE Press

Full text available: pdf(928.82 KB) Additional Information: full citation, abstract, references, index terms

We present a closed-loop architecture and protocols for rapid dynamic spreading gain 
adaptation and fast feedback between a transmitter and a receiver communicating with 
each other in CDMA networks. These protocols and architecture do not require the 
transfer of an explicit control message indicating the change of CDMA spreading gain from 
transmitter to receiver. Also, with these protocols, the transmitter can change the 
spreading gain symbol-by-symbol as opposed to frame-by-frame, and feedback ... 

Keywords: CDMA, OVSF codes, rate adaptation 

6 An innovative simulation tool for advanced signal processing in UMTS systems
Dania Marabissi, Marco Michelini, Luca Simone Ronga

September 2004 Wireless Networks, volume 10 issue 5
Publisher: Kluwer Academic Publishers

Full text available: pdf(545.12 KB) Additional Information: full citation, abstract, references, index terms

Link-level simulations are essential in the design of UMTS communication systems. The
large number of interdependent variables makes it impossible to derive easy design steps
without an efficient modeling of the environments and the implemented reception
schemes. In this paper, a novel tool for UMTS design is presented. The tool includes a fast
C++ simulation engine and a complete 3GPP library to model the uplink transmission
chain. As an example, a series of Monte Carlo performance simulatio ...

Keywords: 3G-simulation environment, CDMA advanced receivers, DSP system design,
code division multiple access (CDMA), fading channel models, multirate systems,
object-oriented simulation tool



7 W1-C: general symposium: A multi-rate CDMA system with block-spreading schemes
for anti-interference and high frequency efficiency

Mitsuhiro Tomita, Noriyoshi Kuroyanagi, Kohei Ohtake, Naoki Suehiro, Sinya Matsufuji
July 2006 Proceeding of the 2006 international conference on Communications and
mobile computing IWCMC '06
Publisher: ACM Press

Full text available: pdf(342.74 KB) Additional Information: full citation, abstract, references, index terms

For mobile systems higher than 3G, multi-rate data service is becoming an important
issue. This paper proposes a multi-rate block-spreading CDMA system as an efficient
scheme which is capable of intra-cell-interference free operation and inter-cell-interference
reduction by a factor of reciprocal of the spreading factor. With a use of an
accurate pilot transmission scheme, it is shown that a high frequency utilization efficiency 
can be achieved for various mixed data rate services. 

8 Power control based QoS provisioning for multimedia in W-CDMA
Özgür Gürbüz, Henry Owen

January 2002 Wireless Networks, volume 8 issue 1
Publisher: Kluwer Academic Publishers

Full text available: pdf(247.47 KB) Additional Information: full citation, abstract, references, index terms

Third generation wireless communication systems will support multimedia, and W-CDMA
will be the common air interface technology. Due to the interference limited nature of
CDMA, power is the main resource of the network, and power control is a means of
resource management. In this article, we introduce Dynamic Resource Scheduling (DRS)
as a framework which employs power control for QoS provisioning of multimedia traffic in
W-CDMA. In DRS, we propose the application of optimal power assignment to ... 

Keywords: WCDMA, power control, wireless QoS 



9 Tools and Methodologies: Simulation tools for advanced signal processing in UMTS
systems

Dania Marabissi, Marco Michelini, Luca Simone Ronga

September 2002 Proceedings of the 5th ACM international workshop on Modeling
analysis and simulation of wireless and mobile systems MSWiM '02

Publisher: ACM Press

Full text available: pdf(317.64 KB) Additional Information: full citation, abstract, references, index terms

Link-level simulations are essential in the design of UMTS communication systems. The 
large number of inter-dependent variables makes it impossible to derive easy design 
steps without an efficient modeling of the environments and the implemented reception 
schemes. In this paper, a novel tool for UMTS design is presented. The tool includes a fast 
C++ simulation engine and a complete 3GPP library to model the uplink transmission 
chain. As an example, a series of Monte Carlo performance simulations ...

Keywords: 3G-simulation environment, CDMA advanced receivers, DSP system design,
code division multiple access (CDMA), fading channel models, multirate systems,
object-oriented simulation tool



10 Standards: Proposed american standard: bit sequencing of the american standard
code for information interchange (ASCII) in serial-by-bit data transmission
S. Gorn, R. W. Bemer, J. Green, E. Lohse

June 1964 Communications of the ACM, volume 7 issue 6

Publisher: ACM Press

Full text available: pdf(610.39 KB) Additional Information: full citation



11 M1-A: communication and information theory symposium: Performance analysis of
multi chip/data rate DS-CDMA signals over multipath Rayleigh fading channels
Ertan Ozturk

July 2006 Proceeding of the 2006 international conference on Communications and
mobile computing IWCMC '06
Publisher: ACM Press

Full text available: pdf(392.30 KB) Additional Information: full citation, abstract, references, index terms

This paper investigates the probability of error (P_e) performance of Multi-Chip/Data Rate
Direct Sequence Code Division Multiple Access (MCDR/DS-CDMA) systems over multipath
Rayleigh fading channels. Two chip waveforms, raised cosine (RC) and an orthogonal
wavelet, are compared numerically in terms of the P_e. The results show that the
wavelet based system significantly outperforms the RC based system in terms of the P_e.
On the other hand, the wavelets have greate ...

Keywords: chip waveforms, multi-chip/data rate CDMA, multipath fading channels,
performance analysis, wavelets 



12 Correspondences of 8-bit and Hollerith codes for computer environments— a USASI
tutorial
E. Lohse

November 1968 Communications of the ACM, volume 11 issue 11
Publisher: ACM Press

Full text available: pdf(709.73 KB) Additional Information: full citation, abstract

The correspondence tables in the document reflect USASCII standard code assignments
as well as other codes. Comments that refer to the assignments of characters or character 
sets in columns 8 through 15 of Table 1 as a basis for standardization are solicited. 

Keywords: USA standard, card code, hole-patterns, hole-patterns assignment, punched 
card, punched card code, punched card systems 



13 Native code compilation of Erlang's bit syntax
Per Gustafsson, Konstantinos Sagonas

October 2002 Proceedings of the 2002 ACM SIGPLAN workshop on Erlang ERLANG '02

Publisher: ACM Press

Full text available: pdf(196.81 KB) Additional Information: full citation, abstract, references, citings

Erlang's bit syntax caters for flexible pattern matching on bit streams (objects known as
binaries). Binaries are nowadays heavily used in typical Erlang applications such as
protocol programming, which in turn has created a need for efficient support of the basic
operations on binaries. To this effect, we describe a scheme for efficient native code
compilation of Erlang's bit syntax. The scheme relies on partial translation for avoiding
code explosion, an ...

14 Enhancing the performance of 16-bit code using augmenting instructions
Arvind Krishnaswamy, Rajiv Gupta

June 2003 ACM SIGPLAN Notices, Proceedings of the 2003 ACM SIGPLAN conference
on Language, compiler, and tool for embedded systems LCTES '03, volume 38 issue 7

Publisher: ACM Press

Full text available: pdf(276.13 KB) Additional Information: full citation, abstract, references, citings, index terms

In the embedded domain, memory usage and energy consumption are critical constraints. 
Dual width instruction set embedded processors such as the ARM provide a 16-bit 
instruction set in addition to the 32-bit instruction set to address these concerns. Using 
16-bit instructions one can achieve code size reduction and I-cache energy savings at the 
cost of performance. We have observed that throughout 16-bit Thumb code there exist 
Thumb instruction pairs that are equivalent to a single ARM instruct ...

Keywords: 16-bit thumb ISA, 32-bit ARM ISA, AX Instructions, code size, embedded 
processor, instruction coalescing, performance 



15 Code compression: Compiler optimization and ordering effects on VLIW code
compression

Montserrat Ros, Peter Sutton

October 2003 Proceedings of the 2003 international conference on Compilers,
architecture and synthesis for embedded systems CASES '03

Publisher: ACM Press

Full text available: pdf(334.18 KB) Additional Information: full citation, abstract, references, citings, index terms

Code size has always been an important issue for all embedded applications as well as
larger systems. Code compression techniques have been devised as a way of battling
bloated code; however, the impact of VLIW compiler methods and outputs on these
compression schemes has not been thoroughly investigated. This paper describes the
application of single- and multiple-instruction dictionary methods for code compression to
decrease overall code size for the TI TMS320C6xxx DSP family. The compression ...

Keywords: VLIW, code compression, compiler optimizations 



16 Dynamic coalescing for 16-bit instructions
Arvind Krishnaswamy, Rajiv Gupta

February 2005 ACM Transactions on Embedded Computing Systems (TECS), volume 4
issue 1

Publisher: ACM Press

Full text available: pdf(487.89 KB) Additional Information: full citation, abstract, references, citings, index terms

In the embedded domain, memory usage and energy consumption are critical
constraints. Embedded processors such as the ARM and MIPS provide a 16-bit instruction
set (called Thumb in the case of the ARM family of processors), in addition to the 32-bit
instruction set to address these concerns. Using 16-bit instructions one can achieve code
size reduction and instruction cache energy savings at the cost of performance. This paper
presents a novel approach that enhances the performance of 16-bit Thu ...

Keywords: 16-bit Thumb ISA, 32-bit ARM ISA, AX instructions, Embedded processor, 
code size, energy, instruction coalescing, performance 



17 Code generation for compiled bit-true simulation for DSP application
L. De Coster, M. Ade, R. Lauwereins, J. Peperstraete

December 1998 Proceedings of the 11th international symposium on System
synthesis ISSS '98

Publisher: IEEE Computer Society

Full text available: pdf(1.13 MB), Publisher Site Additional Information: full citation, references, citings, index terms



18 Standards: Proposed american standard: perforated tape code for information
interchange

June 1964 Communications of the ACM, volume 7 issue 6
Publisher: ACM Press

Full text available: pdf(376.46 KB) Additional Information: full citation




19 Multiattribute hashing using Gray codes
Christos Faloutsos

June 1986 ACM SIGMOD Record, Proceedings of the 1986 ACM SIGMOD international
conference on Management of data SIGMOD '86, volume 15 issue 2
Publisher: ACM Press

Full text available: pdf(883.73 KB) Additional Information: full citation, abstract, references, citings, index terms

Multiattribute hashing and its variations have been proposed for partial match and range
queries in the past. The main idea is that each record yields a bitstring @@@@ ("record
signature"), according to the values of its attributes. The binary value (@@@@)2 of this
string decides the bucket in which the record is stored. In this paper we propose to use Gray
codes instead of binary codes, in order to map record signatures to buckets. In Gray
codes, successive cod ... 
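
To illustrate the property this abstract relies on, here is a small Python sketch of the standard binary-reflected Gray code, in which successive codewords differ in exactly one bit. Reading the record signature as a Gray codeword and using its rank as the bucket number is an assumption made here for illustration; the function names are hypothetical, and the paper's exact mapping may differ.

```python
def to_gray(n):
    """Return the binary-reflected Gray codeword for the integer n >= 0."""
    return n ^ (n >> 1)


def from_gray(g):
    """Invert to_gray: return the rank of the Gray codeword g."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


def bucket_for_signature(signature_bits):
    """Hypothetical mapping: treat the signature string as a Gray codeword
    and use its rank in Gray order as the bucket number."""
    return from_gray(int(signature_bits, 2))


if __name__ == "__main__":
    # List the 3-bit Gray codewords in order of rank.
    for rank in range(8):
        print(rank, format(to_gray(rank), "03b"))
```

Under this assumed mapping, signatures that differ in one bit often, though not always, land in adjacent buckets, which is the kind of clustering a Gray-code ordering is meant to encourage.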

20 Fast software implementation of error detection codes 
David C. Feldmeier

December 1995 IEEE/ACM Transactions on Networking (TON), volume 3 issue 6
Publisher: IEEE Press

Full text available: pdf(1.25 MB) Additional Information: full citation, references, citings, index terms

1 Track 14: quantum computing: Improving quantum circuit dependability with
reconfigurable quantum gate arrays

Mihai Udrescu, Lucian Prodan, Mircea Vladutiu

May 2005 Proceedings of the 2nd conference on Computing frontiers CF '05

Publisher: ACM Press

Full text available: pdf(343.73 KB) Additional Information: full citation, abstract, references, citings, index terms



The need for error detection and correction techniques is vital in quantum computation, 
due to the omnipresent nature of quantum errors. No realistic prospect of an operational 
quantum computational device may be warranted without such mechanisms. Therefore, 
the fact that error detecting and correcting techniques have been developed has 
enhanced the feasibility of a potential quantum computer [15] [18]. This paper presents a 
methodology for improving the fault tolerance of quantum circuits by us ... 



Keywords: accuracy threshold, coding, reconfigurable quantum gate arrays 



2 Signature of symmetric rational matrices and the unitary dual of lie groups
Jeffrey Adams, B. David Saunders, Zhendong Wan

July 2005 Proceedings of the 2005 International symposium on Symbolic and
algebraic computation ISSAC '05

Publisher: ACM Press

Full text available: pdf(215.88 KB) Additional Information: full citation, abstract, references, index terms

A key step in the computation of the unitary dual of a Lie group is the determination if
certain rational symmetric matrices are positive semi-definite. The size of some of the
computations dictates that high performance integer matrix computations be used. We
explore the feasibility of this approach by developing three algorithms for integer 
symmetric matrix signature and studying their performance both asymptotically and 
experimentally on a particular matrix family constructed from the excepti ... 

Keywords: lie group, matrix signature, symmetric matrix 



3 Session 3A: Exponential lower bound for 2-query locally decodable codes via a
quantum argument

Iordanis Kerenidis, Ronald de Wolf
June 2003 Proceedings of the thirty-fifth annual ACM symposium on Theory of
computing STOC '03

Publisher: ACM Press

Full text available: pdf(313.57 KB) Additional Information: full citation, abstract, references, citings, index terms

A locally decodable code encodes n-bit strings x in m-bit codewords C(x), in such a way
that one can recover any bit x_i from a corrupted codeword by querying only a few bits of
that word. We use a quantum argument to prove that LDCs with 2 classical queries need
exponential length: m = 2^Ω(n). Previously this was known only for linear codes (Goldreich et
al. 02). Our proof shows that a 2-query LDC can be decoded with only 1 ...

Keywords: locally decodable codes, private information retrieval, quantum computing 



4 Session 9A: Generic quantum Fourier transforms
Cristopher Moore, Daniel Rockmore, Alexander Russell

January 2004 Proceedings of the fifteenth annual ACM-SIAM symposium on Discrete
algorithms SODA '04
Publisher: Society for Industrial and Applied Mathematics

Full text available: pdf(170.02 KB) Additional Information: full citation, abstract, references, citings

The quantum Fourier transform (QFT) is the principal ingredient of most efficient quantum
algorithms. We present a generic framework for the construction of efficient quantum
circuits for the QFT by "quantizing" the highly successful separation of variables technique
for the construction of efficient classical Fourier transforms. Specifically, we use Bratteli
diagrams, Gel'fand-Tsetlin bases, and strong generating sets of small adapted diameter to
provide efficient quantum circuits ... 

5 Generic quantum Fourier transforms

Cristopher Moore, Daniel Rockmore, Alexander Russell

October 2006 ACM Transactions on Algorithms (TALG), volume 2 issue 4

Publisher: ACM Press

Full text available: pdf(169.41 KB) Additional Information: full citation, abstract, references, index terms

The quantum Fourier transform (QFT) is a principal ingredient appearing in many efficient
quantum algorithms. We present a generic framework for the construction of efficient
quantum circuits for the QFT by "quantizing" the highly successful separation of variables
technique for the construction of efficient classical Fourier transforms. Specifically, we
apply Bratteli diagrams, Gel'fand-Tsetlin bases, and strong generating sets of small
adapted diameter to provide effi ... 

Keywords: Quantum computation, group theory 



6 Multiple-transform pipelines for image coding
A. Antola

June 1988 Proceedings of the 2nd international conference on Supercomputing ICS '88

Publisher: ACM Press

Full text available: pdf(720.79 KB) Additional Information: full citation, abstract, references, index terms

Pipelined VLSI/WSI architectures supporting image coding transforms are defined and 
evaluated in the paper. The structures proposed in the paper have been derived by 
considering a common algorithmic kernel of the set of examined transforms. The 
possibility of reducing the computations to a common algorithmic version allows definition 
of flexible structures characterized by a "basic" pipeline - performing the common kernel
of computation - and by transform-dependent input and out ...

7 An introduction to quantum computing for non-physicists
September 2000 ACM Computing Surveys (CSUR), volume 32 issue 3
Publisher: ACM Press

Full text available: pdf(491.89 KB) Additional Information: full citation, abstract, references, citings, index terms, review

Richard Feynman's observation that certain quantum mechanical effects cannot be 
simulated efficiently on a computer led to speculation that computation in general could
be done more efficiently if it used these quantum effects. This speculation proved justified
when Peter Shor described a polynomial time quantum algorithm for factoring integers. In
quantum systems, the computational space increases exponentially with the size of the 
system, which enables exponential parallelism. ... 

Keywords: complexity, parallelism, quantum computing 



8 Special session on reliable computing: A dependability perspective on emerging
technologies

Lucian Prodan, Mihai Udrescu, Mircea Vladutiu

May 2006 Proceedings of the 3rd conference on Computing frontiers CF '06
Publisher: ACM Press

Full text available: pdf(660.67 KB) Additional Information: full citation, abstract, references, index terms

Emerging technologies are set to provide further provisions for computing in times when
the limits of current technology of microelectronics become an ever closer presence. A
technology roadmap document lists biologically-inspired computing and quantum
computing as two emerging technology vectors for novel computing architectures [43]. 
But the potential benefits that will come from entering the nanoelectronics era and from 
exploring novel nanotechnologies are foreseen to come at the cost of incr ... 

Keywords: bio-inspired computing, bio-inspired digital design, dependability, 
embryonics, emerging technologies, evolvable hardware, fault-tolerance assessment, 
quantum computing, reliability 



9 Shake 'em, but don't crack 'em: Cracking the Bluetooth PIN
Yaniv Shaked, Avishai Wool

June 2005 Proceedings of the 3rd international conference on Mobile systems,
applications, and services MobiSys '05

Publisher: ACM Press

Full text available: pdf(223.67 KB) Additional Information: full citation, abstract, references

This paper describes the implementation of an attack on the Bluetooth security
mechanism. Specifically, we describe a passive attack, in which an attacker can find the
PIN used during the pairing process. We then describe the cracking speed we can achieve
through three optimization methods. Our fastest optimization employs an algebraic
representation of a central cryptographic primitive (SAFER+) used in Bluetooth. Our 
results show that a 4-digit PIN can be cracked in less than 0.3 sec on an old ... 



10 Quantum computing: Quantum designer and network simulator
Sander Imre, Peter Abronits, Daniel Darabos

April 2004 Proceedings of the 1st conference on Computing frontiers CF '04
Publisher: ACM Press

Full text available: pdf(771.18 KB) Additional Information: full citation, abstract, references, index terms

In this paper we introduce our new quantum circuit design tool. Based on quantum
mechanical models a universal discrete-time quantum-network designer and simulator 
was implemented. The graphical user interface allows the user to design complex 
quantum networks efficiently. The component-based architecture enables independent 
researchers to use our simulator API (written in C), while we continue to expand and 
refine the C# based user interface. Future plans include components for distributed 
simu ... 

Keywords: quantum algorithms and circuits, quantum computing, simulation 



11 Datapath and control for quantum wires
Nemanja Isailovic, Mark Whitney, Yatish Patel, John Kubiatowicz, Dean Copsey, Frederic T.
Chong, Isaac L. Chuang, Mark Oskin

March 2004 ACM Transactions on Architecture and Code Optimization (TACO), volume 1
issue 1
Publisher: ACM Press

Full text available: pdf(476.83 KB) Additional Information: full citation, abstract, references, citings, index terms

As quantum computing moves closer to reality the need for basic architectural studies 
becomes more pressing. Quantum wires, which transport quantum data, will be a 
fundamental component in all anticipated silicon quantum architectures. Since they 
cannot consist of a stream of electrons, as in the classical case, quantum wires must 
fundamentally be designed differently. In this paper, we present two quantum wire 
designs: a swap wire, based on swapping of adjacent qubits, and a teleportation wire, ... 

Keywords: Architecture, Control, Layout 



12 Locally testable codes and PCPs of almost-linear length 
Oded Goldreich, Madhu Sudan 

July 2006 Journal of the ACM (JACM), volume 53 issue 4 
Publisher: ACM Press 

Full text available: pdf(749.48 KB) Additional Information: full citation, abstract, references, index terms

We initiate a systematic study of locally testable codes; that is, error-correcting codes 
that admit very efficient membership tests. Specifically, these are codes accompanied 
with tests that make a constant number of (random) queries into any given word and 
reject non-codewords with probability proportional to their distance from the code. Locally 
testable codes are believed to be the combinatorial core of PCPs. However, the relation is 
less immediate than commonly believed. Nevertheless, we sho ...

Keywords: Proof verification, derandomization, error-correcting codes, probabilistically
checkable proofs 



13 Networks I: The effect of communication costs in solid-state quantum computing
architectures

Dean Copsey, Mark Oskin, Tzvetan Metodiev, Frederic T. Chong, Isaac Chuang, John
Kubiatowicz

June 2003 Proceedings of the fifteenth annual ACM symposium on Parallel
algorithms and architectures SPAA '03
Publisher: ACM Press

Full text available: pdf(49.00 KB) Additional Information: full citation, abstract, references, citings, index terms

Quantum computation has become an intriguing technology with which to attack difficult
problems and to enhance system security. Quantum algorithms, however, have been
analyzed under idealized assumptions without important physical constraints in mind. In
this paper, we analyze two key constraints: the short spatial distance of quantum 
interactions and the short temporal life of quantum data. In particular, quantum 
computations must make use of extremely robust error correction techniques to exten ... 

Keywords: quantum architecture, quantum computing, silicon-based quantum 
computing 

14 Two-timescale simultaneous perturbation stochastic approximation using
deterministic perturbation sequences

Shalabh Bhatnagar, Michael C. Fu, Steven I. Marcus, I-Jeng Wang

April 2003 ACM Transactions on Modeling and Computer Simulation (TOMACS), volume
13 issue 2
Publisher: ACM Press

Full text available: pdf(294.83 KB) Additional Information: full citation, abstract, references, citings, index terms

Simultaneous perturbation stochastic approximation (SPSA) algorithms have been found
to be very effective for high-dimensional simulation optimization problems. The main idea
is to estimate the gradient using simulation output performance measures at only two
settings of the N-dimensional parameter vector being optimized rather than at the N + 1
or 2N settings required by the usual one-sided or symmetric difference estimates,
respectively. The two settings of the para ...

Keywords: Hadamard matrices, SPSA, Simulation optimization, deterministic
perturbations, stochastic approximation, two-timescale algorithms

15 Symbolic simulation and verification: Gate-level simulation of quantum circuits
George F. Viamontes, Manoj Rajagopalan, Igor L. Markov, John P. Hayes
January 2003 Proceedings of the 2003 conference on Asia South Pacific design
automation ASPDAC
Publisher: ACM Press

Full text available: pdf(14.53 KB) Additional Information: full citation, abstract, references, citings

Simulating quantum computation on a classical computer is a difficult problem. The
matrices representing quantum gates, and vectors modeling qubit states grow
exponentially with an increase in the number of qubits. However, by using a new data
structure called the Quantum Information Decision Diagram (QuIDD) that exploits the
structure of quantum operators, many of these matrices and vectors can be represented
in a form that grows polynomially. Using QuIDDs, we implemented a general-purpose
quan ... 

16 Approximating the domatic number
Uriel Feige, Magnus M. Halldorsson, Guy Kortsarz

May 2000 Proceedings of the thirty-second annual ACM symposium on Theory of
computing STOC '00

Publisher: ACM Press

Full text available: pdf(1.10 MB) Additional Information: full citation, references, citings, index terms

August 1994 ACM SIGAPL APL Quote Quad, Proceedings of the international
its applications APL '94, volume 25 issue 1

Publisher: ACM Press

Full text available: pdf(1.89 MB) Additional Information: full citation, references, citings, index terms

18 Poster session: FPGA implementation of a fast Hadamard transformer for WCDMA
Sanat Kamal Bahl, Jim Plusquellic 

February 2003 Proceedings of the 2003 ACM/SIGDA eleventh international 
symposium on Field programmable gate arrays FPGA '03 

Publisher: ACM Press 

Full text available: pdf(187.05 KB) Additional Information: full citation , abstract

In code division multiple access (CDMA) systems the base station identifies each user in a
cell by unique orthogonal (Walsh) codes. The Walsh codes are generated at the 
transmitter using a Walsh-Hadamard function. A Fast Hadamard Transformer (FHT) is 
used at the receiver to decode the transmitted codes. The purpose of this study is to 
design a FHT which utilizes less hardware resources as compared to the existing designs 
and also suggest means for reducing the input length of the Walsh sequence. ... 
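
As a rough illustration of the decoding step this abstract describes, the sketch below builds Walsh codes with the Sylvester construction and recovers a code index with a software fast Hadamard transform. It is not the paper's FPGA design; the function names (`walsh_matrix`, `fast_hadamard_transform`, `decode_walsh_index`) and the natural (Hadamard) row ordering are assumptions made for this example.

```python
def walsh_matrix(order):
    """Sylvester-construction Hadamard matrix of size 2**order, entries +1/-1."""
    rows = [[1]]
    for _ in range(order):
        # Each step doubles the size: [H H; H -H].
        rows = [r + r for r in rows] + [r + [-x for x in r] for r in rows]
    return rows


def fast_hadamard_transform(samples):
    """Butterfly transform in O(n log n); len(samples) must be a power of two."""
    data = list(samples)
    n = len(data)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    half = 1
    while half < n:
        for start in range(0, n, 2 * half):
            for i in range(start, start + half):
                a, b = data[i], data[i + half]
                data[i], data[i + half] = a + b, a - b
        half *= 2
    return data


def decode_walsh_index(received):
    """Pick the row index whose correlation with the received chips is largest."""
    correlations = fast_hadamard_transform(received)
    return max(range(len(correlations)), key=lambda k: abs(correlations[k]))
```

Under these assumptions, transforming an undistorted row of the matrix yields a single non-zero correlation at that row's index, which is why one transform is enough to test all candidate codes at once.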

19 Concurrent error detection of fault-based side-channel cryptanalysis of 128-bit
symmetric block ciphers

Ramesh Karri, Kaijie Wu, Piyush Mishra, Yongkook Kim
June 2001 Proceedings of the 38th conference on Design automation DAC '01 

Publisher: ACM Press 

Full text available: pdf(260.32 KB) Additional Information: full citation , abstract , references , index terms

Fault-based side channel cryptanalysis is very effective against symmetric and 
asymmetric encryption algorithms. Although straightforward hardware and time 
redundancy based concurrent error detection (CED) architectures can be used to thwart 
such attacks, they entail significant overhead (either area or performance). In this paper 
we investigate systematic approaches to low-cost, low-latency CED for symmetric 
encryption algorithms based on the inverse relationship that exists between encryp ...

20 MPEG: a video compression standard for multimedia applications
Didier Le Gall 

April 1991 Communications of the ACM, volume 34 issue 4 
Publisher: ACM Press 

Full text available: pdf(9.16 MB) Additional Information: full citation , references , citings, index terms , review

1 Reconfigurable Signal Processing in Wireless Terminals

Jurgen Helmschmidt, Eberhard Schuler, Prashant Rao, Sergio Rossi, Serge di Matteo, Rainer
Bonitz

March 2003 Proceedings of the conference on Design, Automation and Test in 
Europe: Designers' Forum - Volume 2 DATE '03 

Publisher: IEEE Computer Society 
Full text available: pdf(399.68 KB), Publisher Site Additional Information: full citation , abstract , citings, index terms

In this paper, we show the necessity of reconfigurable hardware for data and signal 
processing in wireless mobile terminals. We first identify the key processing power 
requirements for realizing a third generation wireless mobile terminal with multi-link and 
multi-standard capabilities. This is done on the basis of two world applications: a flexible 
mobile rake receiver for UMTS/W-CDMA and an OFDM decoder for high-speed wireless 
LAN protocols. We present a software-defined concept and a system i ... 

2 Circuit techniques for scaled technologies: A two-port SRAM for real-time video 
processor saving 53% of bitline power with majority logic and data-bit reordering
Hidehiro Fujiwara, Koji Nii, Junichi Miyakoshi, Yuichiro Murachi, Yasuhiro Morita, Hiroshi
Kawaguchi, Masahiko Yoshimoto 

October 2006 Proceedings of the 2006 international symposium on Low power
electronics and design ISLPED '06
Publisher: ACM Press

Full text available: pdf(382.04 KB) Additional Information: full citation , abstract , references , index terms

We propose a low-power two-port SRAM suitable for real-time video processing. In order 
to minimize discharge power on a read bitline, a majority-logic decides if input data are 
inverted in a write cycle, so that "1"s are in the majority. In video data, since more
significant bits of adjacent pixel data are fortunately lopsided to either "0" or "1" with 
higher probability, the data bits in the pixels are reordered in each digit group to exploit 
the majority logic. The speed and area overheads are ... 

Keywords: data-bit reordering, low power SRAM, majority logic, real-time image
processing, two-port SRAM
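
The majority-logic step mentioned in this abstract can be sketched in a few lines. This is only a software model under assumed conventions, not the paper's circuit: the helper names `encode_majority` and `decode_majority`, the list-of-bits representation, and the choice to leave a tied word unchanged are all hypothetical.

```python
def encode_majority(bits):
    """Return (stored_bits, inverted) so that stored_bits has at least as many 1s as 0s."""
    ones = sum(bits)
    # Invert only when 0s strictly outnumber 1s; a tie is stored unchanged (assumption).
    inverted = ones * 2 < len(bits)
    stored = [1 - b for b in bits] if inverted else list(bits)
    return stored, inverted


def decode_majority(stored, inverted):
    """Undo the write-time inversion using the stored flag bit."""
    return [1 - b for b in stored] if inverted else list(stored)
```

The flag costs one extra bit per word, which is the usual price of inversion-based coding; the abstract's data-bit reordering is a separate step that this sketch does not model.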

June 2005 ACM SIGPLAN Notices, Proceedings of the 2005 ACM SIGPLAN/SIGBED
conference on Languages, compilers, and tools for embedded systems
LCTES '05, Volume 40 Issue 7
Publisher: ACM Press

Full text available: pdf(356.05 KB) Additional Information: full citation , abstract , references , citings , index terms

Short vector (SIMD) instructions are useful in signal processing, multimedia, and scientific
applications. They offer higher performance, lower energy consumption, and better 
resource utilization. However, compilers still do not have good support for SIMD 
instructions, and often the code has to be written manually in assembly language or using 
compiler builtin functions. Also, in some applications, higher parallelism could be achieved 
if compilers inserted permutation instructions that reorder t ... 

Keywords: SIMD, permutations 



4 Implicit array bounds checking on 64-bit architectures

Chris Bentley, Scott A. Watterson, David K. Lowenthal, Barry Rountree
December 2006 ACM Transactions on Architecture and Code Optimization (TACO),
Volume 3 Issue 4
Publisher: ACM Press

Full text available: pdf(548.20 KB) Additional Information: full citation , abstract , references , index terms

Several programming languages guarantee that array subscripts are checked to ensure 
they are within the bounds of the array. While this guarantee improves the correctness 
and security of array-based code, it adds overhead to array references. This has been an 
obstacle to using higher-level languages, such as Java, for high-performance parallel 
computing, where the language specification requires that all array accesses must be 
checked to ensure they are within bounds. This is because, in practic ... 

Keywords: 64-bit architectures, Array-bounds checking, virtual memory





5 Caching I: New results on web caching with request reordering

Susanne Albers

June 2004 Proceedings of the sixteenth annual ACM symposium on Parallelism in
algorithms and architectures SPAA '04

Publisher: ACM Press

Full text available: pdf(186.52 KB) Additional Information: full citation , abstract , references , index terms

We study web caching with request reordering. The goal is to maintain a cache of web
documents so that a sequence of requests can be served at low cost. To improve cache 
hit rates, a limited reordering of requests is allowed. Feder et al. [6], who recently 
introduced this problem, considered caches of size 1, i.e. a cache can store one 
document. They presented an offline algorithm based on dynamic programming as well as 
online algorithms that achieve constant factor competitive ratios. For arbit ... 

Keywords: approximation, batch, cache, competitive, document, offline, online 



6 Using Rewriting Rules and Positive Equality to Formally Verify Wide-Issue Out-of- 
Order Microprocessors with a Reorder Buffer 
M. Velev 

March 2002 Proceedings of the conference on Design, automation and test in Europe 
DATE '02 

Publisher: IEEE Computer Society 
Full text available: pdf(192.83 KB) Additional Information: full citation , abstract , citings

Rewriting rules and Positive Equality [4] are combined in an automatic way in order to
formally verify out-of-order processors that have a Reorder Buffer, and can issue/retire
multiple instructions per clock cycle. Only register-register instructions are implemented,
and can be executed out-of-order, as soon as their data operands can be either read from
the Register File, or forwarded as results of instructions ahead in program order in the
Reorder Buffer. The verification is based on the Burch and Di ...

7 Packet reordering is not pathological network behavior
Jon C. R. Bennett, Craig Partridge, Nicholas Shectman

December 1999 IEEE/ACM Transactions on Networking (TON), volume 7 issue 6
Publisher: IEEE Press

Full text available: pdf(107.65 KB) Additional Information: full citation , references , citings , index terms

Keywords: Internet, communication system traffic, packet switching

8 W1-C: general symposium: New non-blocking EOVSF codes for multi-rate WCDMA
system

Yih-Fuh Wang, Hsing-Hu Chen, Tun-Ying Lin

July 2006 Proceeding of the 2006 international conference on Communications and 
mobile computing IWCMC '06 

Publisher: ACM Press 

Full text available: pdf(886.94 KB) Additional Information: full citation , abstract , references , index terms

Orthogonal variable spreading factor (OVSF) codes are employed in the third generation 
(3G) wideband code division multiple access (WCDMA) wireless system as channelization 
codes. Any two codes OVSF of different levels are orthogonal if and only if one of two
codes is not ancestor/descendant in each other. Therefore, when an OVSF code is 
assigned to a user, it blocks all of its ancestor and descendant codes. This results in a 
major drawback of OVSF codes, called blocking property: When an OVSF c ... 

Keywords: EOVSF codes, OVSF codes, WCDMA, third generation (3G) 
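
The ancestor/descendant rule stated in this abstract can be checked numerically on a small code tree. The sketch below is an illustration only, assuming the conventional OVSF tree in which a code `c` has children `(c, c)` and `(c, -c)`; the function names and the 0-based indexing are choices made for this example, and the paper's EOVSF codes are not modeled.

```python
def ovsf_code(sf, index):
    """Code number `index` (0-based) at spreading factor `sf` (a power of two)."""
    if sf < 1 or sf & (sf - 1) or not 0 <= index < sf:
        raise ValueError("sf must be a power of two and 0 <= index < sf")
    code = [1]
    depth = sf.bit_length() - 1
    # Walk down from the root; each index bit (MSB first) selects a child.
    for shift in range(depth - 1, -1, -1):
        bit = (index >> shift) & 1
        code = code + ([-c for c in code] if bit else code)
    return code


def is_related(sf_a, idx_a, sf_b, idx_b):
    """True if the two codes are identical or one is an ancestor of the other."""
    if sf_a > sf_b:
        sf_a, idx_a, sf_b, idx_b = sf_b, idx_b, sf_a, idx_a
    return idx_b // (sf_b // sf_a) == idx_a


def are_orthogonal(code_a, code_b):
    """Correlate the shorter code against every aligned segment of the longer one."""
    short, long_ = sorted((code_a, code_b), key=len)
    n = len(short)
    return all(
        sum(s * c for s, c in zip(short, long_[i:i + n])) == 0
        for i in range(0, len(long_), n)
    )
```

For example, with these definitions `ovsf_code(4, 2)` descends from `ovsf_code(2, 1)`, so the two are not orthogonal, while `ovsf_code(4, 2)` and `ovsf_code(2, 0)` are. This is the blocking behavior the abstract refers to: assigning one code rules out its whole ancestor and descendant chain.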



9 Scan-BIST based on cluster analysis and the encoding of repeating sequences
Lei Li, Zhanglei Wang, Krishnendu Chakrabarty

January 2007 ACM Transactions on Design Automation of Electronic Systems
(TODAES), Volume 12 Issue 1
Publisher: ACM Press

Full text available: pdf(523.07 KB) Additional Information: full citation , abstract , references , index terms

We present a built-in self-test (BIST) approach for full-scan designs that extracts the 
most frequently occurring sequences from deterministic test patterns. The extracted 
sequences are stored on-chip, and are used during test application. Three sets of test 
patterns are applied to the circuit under test during a BIST test session; these include 
pseudorandom patterns, semirandom patterns, and deterministic patterns. The 
semirandom patterns are generated based on the stored sequences and they are ... 

Keywords: Built-in self-test (BIST), clustering test data volume, test compression 

March 2003 Proceedings of the international symposium on Code generation and
optimization: feedback-directed and runtime optimization CGO '03

Publisher: IEEE Computer Society

Full text available: pdf(1.31 MB) Additional Information: full citation , abstract , references , citings , index terms

Memory access has proven to be one of the bottlenecks in modern architectures.
Improving memory locality and eliminating the amount of memory access can help 
release this bottleneck. We present a method for link-time profile-based optimization by 
reordering the global data of the program and modifying its code accordingly. The 
proposed optimization reorders the entire global data of the program, according to a 
representative execution rate of each instruction (or basic block) in the code. The da ... 

11 WISQ: a restartable architecture using queues

A. R. Pleszkun, J. R. Goodman, W. C. Hsu, R. T. Joersz, G. Bier, P. Woest, P. B. Schechter
June 1987 Proceedings of the 14th annual international symposium on Computer
architecture ISCA '87
Publisher: ACM Press

Full text available: pdf Additional Information: full citation , abstract , references , citings , index terms

In this paper, the WISQ architecture is described. This architecture is designed to achieve 
high performance by exploiting new compiler technology and using a highly segmented 
pipeline. By having a highly segmented pipeline, a very-high-speed clock can be used. 
Since a highly segmented pipeline will require relatively long pipelines, a way must be 
provided to minimize the effects of pipeline bubbles that are formed due to data and 
control dependencies. It is also important to provide a way ... 

12 Temperature and power aware architectures: Reducing reorder buffer complexity 
through selective operand caching

Gurhan Kucuk, Dmitry Ponomarev, Oguz Ergin, Kanad Ghose

August 2003 Proceedings of the 2003 international symposium on Low power
electronics and design ISLPED '03
Publisher: ACM Press

Full text available: Additional Information: full citation , abstract , references , citings , index terms

Modern superscalar processors implement precise interrupts by using the Reorder Buffer 
(ROB). In some microarchitectures, such as the Intel P6, the ROB also serves as a
repository for the uncommitted results. In these designs, the ROB is a complex multi- 
ported structure that dissipates a significant percentage of the overall chip power. 
Recently, a mechanism was introduced for reducing the ROB complexity and its power 
dissipation through the complete elimination of read ports for reading out so ... 

Keywords: low-complexity datapath, low-power design, reorder buffer, short-lived 
values 



13 Compilers: Implicit Java array bounds checking on 64-bit architecture

Chris Bentley, Scott A. Watterson, David K. Lowenthal, Barry Rountree
June 2004 Proceedings of the 18th annual international conference on
Supercomputing ICS '04
Publisher: ACM Press

Full text available: pdf(188.75 KB) Additional Information: full citation , abstract , references , index terms

Interest in using Java for high-performance parallel computing has increased in recent 
years. One obstacle that has inhibited Java from widespread acceptance in the scientific 
community is the language requirement that all array accesses must be checked to 
ensure they are within bounds. In practice, array bounds checking in scientific
applications may increase execution time by more than a factor of 2. Previous research
has explored optimizations to statically eliminate bounds checks, but the dy ... 

Keywords: array-bounds checking, Java, virtual memory 



14 Co-synthesis of pipelined structures and instruction reordering constraints for
instruction set processors
Ing-Jer Huang

January 2001 ACM Transactions on Design Automation of Electronic Systems
(TODAES), Volume 6 Issue 1
Publisher: ACM Press

Full text available: pdf(1.58 MB) Additional Information: full citation , abstract , references , index terms

This paper presents a hardware/software co-synthesis approach to pipelined ISP 
(instruction set processor) design. The approach synthesizes the pipeline structure from a 
given instruction set architecture (behavioral) specification. In addition, it generates a set
of reordering constraints that guides the compiler back-end (reorderer) to properly 
schedule instructions so that possible pipeline hazards are avoided and throughput is 
improved. Co-synthesis takes place while resolving ... 

Keywords: compiler instruction optimization, instruction set processor, pipeline hazards,
pipeline taxonomy, synthesis 



15 Testing: Two dimensional reordering of functional test data for compression by ATE

Hamidreza Hashempour, Fabrizio Lombardi
April 2005 Proceedings of the 15th ACM Great Lakes symposium on VLSI GLSVSLI '05

Publisher: ACM Press

Full text available: pdf(121.41 KB) Additional Information: full citation , abstract , references , index terms

This paper presents a novel approach for compressing functional test data in Automatic 
Test Equipment (ATE). A practical technique is presented for 2 Dimensional (2D) 
reordering of test data in which additionally to test vector reordering, column reordering 
is also applied. An ATE based approach to extract the original test vectors from the 2D 
ordered data is presented. The advantage of the approach is substantiated using the 
figure of merit of entropy for the 2D ordered test data of ISCAS bench ... 

Keywords: 2D reordering, ATE, column reordering, functional test data, scan test data, 
test data compression 



16 PL/I program efficiency
Michael McNeil, William Tracz

June 1980 ACM SIGPLAN Notices, volume 15 issue 6
Publisher: ACM Press

Full text available: pdf(1.29 MB) Additional Information: full citation , abstract , references, citings

All PL/I Programmers should be aware of and genuinely concerned about PL/I Program
efficiency. This paper addresses the following question: "How do you write a PL/I program
which the PL/I Compiler will reduce to the smallest and fastest executing machine
language module?" The real world payoffs of knowing how the PL/I Optimizing Compiler
handles different syntactical representations of similar semantic relationships with respect
to code generation and storage allocation can increase program runtime ...

17 Re-Configurable Bus Encoding Scheme for Reducing Power Consumption of the
Cross Coupling Capacitance for Deep Sub-Micron Instruction Bus 

Siu-Kei Wong, Chi-Ying Tsui 

February 2004 Proceedings of the conference on Design, automation and test in 
Europe - Volume 1 DATE '04 

Publisher: IEEE Computer Society 

Full text available: pdf(152.05 KB) Additional Information: full citation , abstract , index terms

In very deep sub-micron designs, cross coupling capacitances become the dominant 
factor of the total bus loading and have a significant impact on the power consumption. In 
this paper, we propose two reconfigurable bus encoding schemes, which are based on the 
correlation among the bit lines, to reduce the power consumption at the cross coupling 
capacitances of the instruction buses. The instruction is encoded by flipping and 
reordering the bit lines during compilation time to reduce the total swi ... 

18 Poster session 2: Orthogonal code generator for 3G wireless transceivers 

Boris D. Andreev, Edward L. Titlebaum, Eby G. Friedman 

April 2003 Proceedings of the 13th ACM Great Lakes symposium on VLSI GLSVLSI '03 
Publisher: ACM Press 

Full text available: pdf(152.16 KB) Additional Information: full citation , abstract , references , index terms

Orthogonal variable spreading factor (OVSF) codes are standard in third generation UMTS 
cellular systems. The efficient generation of these codes is essential for reducing the area 
and power of wireless transceivers. In this paper, the basic properties of this family of 
codes are analyzed from an RTL perspective and two efficient hardware code generators 
are proposed. Tradeoffs and design solutions as well as low power considerations are 
discussed. These results represent the first reported impl ...

Keywords: 3GPP, CDMA, OVSF codes, UMTS, VLSI, WCDMA 



19 Fabric-driven logic synthesis: Layout-aware synthesis of arithmetic circuits
Junhyung Um, Taewhan Kim 

June 2002 Proceedings of the 39th conference on Design automation DAC '02 
Publisher: ACM Press 

Full text available: pdf(127.44 KB) Additional Information: full citation , abstract , references , citings , index terms

In deep sub-micron (DSM) technology, wires are equally or more important than logic
components since wire-related problems such as crosstalk, noise are much critical in 
system-on-chip (SoC) design. Recently, a method [12] for generating a partial product 
reduction tree (PPRT) with optimal-timing using bit-level adders to implement arithmetic 
circuits, which outperforms the current best designs, is proposed. However, in the 
conventional approaches including [12], interconnects are not primary com ... 

Keywords: carry-save-adder, high performance, layout 



20 Ad hoc networks: OVSF-CDMA code assignment in wireless ad hoc networks
Peng-Jun Wan, Xiang-Yang Li, Ophir Frieder 

October 2004 Proceedings of the 2004 joint workshop on Foundations of mobile 
computing DIALM-POMC '04

Publisher: ACM Press 

Full text available: pdf(198.79 KB) Additional Information: full citation , abstract , references , citings, index terms

Orthogonal Variable Spreading Factor (OVSF) CDMA code provides a means of support of
variable rate data service at low hardware cost. In contrast to the conventional orthogonal
fixed-spreading-factor CDMA code, OVSF-CDMA code consists of an infinite number of
codewords with variable rates but not every pair of codewords are orthogonal to each 
other. In an OVSF-CDMA wireless ad hoc network, a code assignment has to be conflict- 
free, i.e., two nodes can be assigned the same codeword or two non-ort ... 

Keywords: OVSF-CDMA, approximation algorithms, code assignment, graph theory, 
system design 

1 Optimizing data permutations for SIMD devices
Gang Ren, Peng Wu, David Padua 

June 2006 ACM SIGPLAN Notices, Proceedings of the 2006 ACM SIGPLAN conference
on Programming language design and implementation PLDI '06, volume 41
Issue 6
Publisher: ACM Press

Full text available: pdf(260.98 KB) Additional Information: full citation , abstract , references , citings , index terms

The widespread presence of SIMD devices in today's microprocessors has made compiler
techniques for these devices tremendously important. One of the most important and
difficult issues that must be addressed by these techniques is the generation of the data
permutation instructions needed for non-contiguous and misaligned memory references.
These instructions are expensive and, therefore, it is of crucial importance to minimize
their number to improve performance and, in many cases, enable speed ...

Keywords: SIMD compilation, data permutation, optimization 



2 Effective compiler generation by architecture description 
Stefan Farfeleder, Andreas Krall, Edwin Steiner, Florian Brandner

June 2006 ACM SIGPLAN Notices, Proceedings of the 2006 ACM SIGPLAN/SIGBED
conference on Language, compilers and tool support for embedded
systems LCTES '06, volume 41 issue 7
Publisher: ACM Press

Full text available: pdf(128.18 KB) Additional Information: full citation , abstract , references , index terms

Embedded systems have an extremely short time to market and therefore require easily 
retargetable compilers. Architecture description languages (ADLs) provide a single concise 
architecture specification for the generation of hardware, instruction set simulators and 
compilers. In this article, we present an ADL for compiler generation. From a specification, 
we can derive an optimized tree pattern matching instruction selector, a register allocator 
and an instruction scheduler. Compared to a hand- ... 

Keywords: architecture description language, code generation, compiler generation 

November 2002 Proceedings of the 2002 IEEE/ACM international conference on
Computer-aided design ICCAD '02

Publisher: ACM Press

Full text available: pdf(246.56 KB) Additional Information: full citation , abstract , references , citings, index terms

Reversible or information-lossless circuits have applications in digital signal processing,
communication, computer graphics and cryptography. They are also a fundamental 
requirement in the emerging field of quantum computation. We investigate the synthesis 
of reversible circuits that employ a minimum number of gates and contain no redundant 
input-output line-pairs (temporary storage channels). We prove constructively that every 
even permutation can be implemented without temporary storage using ... 

4 Advances in synthesis: Transformation rules for designing CNOT-based quantum
circuits

Kazuo Iwama, Yahiko Kambayashi, Shigeru Yamashita 

June 2002 Proceedings of the 39th conference on Design automation DAC '02 

Publisher: ACM Press 

Full text available: pdf(283.51 KB) Additional Information: full citation , abstract , references , citings , index terms

This paper gives a simple but nontrivial set of local transformation rules for Control-NOT
(CNOT)-based combinatorial circuits. It is shown that this rule set is complete, namely,
for any two equivalent circuits, S1 and S2, there is a sequence of transformations, each of
them in the rule set, which changes S1 to S2. Our motivation is to use this rule set for
developing a design theory for quantum circuits whose Boolean logic parts should b ... 

Keywords: CNOT gate, local transformation rules, quantum circuit



5 Symbolic simulation and verification: Gate-level simulation of quantum circuits
George F. Viamontes, Manoj Rajagopalan, Igor L. Markov, John P. Hayes
January 2003 Proceedings of the 2003 conference on Asia South Pacific design
automation ASPDAC
Publisher: ACM Press

Full text available: pdf(114.53 KB) Additional Information: full citation , abstract , references , citings

Simulating quantum computation on a classical computer is a difficult problem. The
matrices representing quantum gates, and vectors modeling qubit states grow
exponentially with an increase in the number of qubits. However, by using a new data 
structure called the Quantum Information Decision Diagram (QuIDD) that exploits the 
structure of quantum operators, many of these matrices and vectors can be represented 
in a form that grows polynomially. Using QuIDDs, we implemented a general-purpose 
quan ... 

6 Systems session 3: assorted topics: Very low complexity MPEG-2 to H.264
transcoding using machine learning
Gerardo Fernandez, Pedro Cuenca, Luis Orozco Barbosa, Hari Kalva

October 2006 Proceedings of the 14th annual ACM international conference on
Multimedia MULTIMEDIA '06
Publisher: ACM Press

Full text available: pdf(1.33 MB) Additional Information: full citation , abstract , references , index terms

This paper presents a novel macroblock mode decision algorithm for inter-frame 
prediction based on machine learning techniques to be used as part of a very low 
complexity MPEG-2 to H.264 video transcoder. Since coding mode decisions take up the 
most resources in video transcoding, a fast macro block (MB) mode estimation would lead 
to reduced complexity. The proposed approach is based on the hypothesis that MB coding
mode decisions in H.264 video have a correlation with the distribution of the mo ... 

Keywords: H.264, MPEG-2, inter-frame, machine learning, transcoding 



7 Poster Session 2: Odd/even bus invert with two-phase transfer for buses with
coupling

Yan Zhang, John Lach, Kevin Skadron, Mircea R. Stan 

August 2002 Proceedings of the 2002 international symposium on Low power 
electronics and design ISLPED '02 

Publisher: ACM Press 

Full text available: pdf(239.83 KB) Additional Information: full citation , abstract , references , citings , index terms

The coupling capacitances between on-chip bus lines become dominant in deep-submicron 
technologies. Coding to reduce the switching activity of the individual lines was enough to 
reduce power on buses in older technologies, but new coding techniques that reduce the 
coupling activity between lines are needed for deep-submicron buses. One such coding 
technique uses the simple observation that coupling capacitances are always charged and 
discharged by activity on neighboring bus lines, ... 

Keywords: bus invert, buses with coupling, coding for low-power I/O 



8 Circuit techniques for scaled technologies: A two-port SRAM for real-time video
processor saving 53% of bitline power with majority logic and data-bit reordering
Hidehiro Fujiwara, Koji Nii, Junichi Miyakoshi, Yuichiro Murachi, Yasuhiro Morita, Hiroshi
Kawaguchi, Masahiko Yoshimoto

October 2006 Proceedings of the 2006 international symposium on Low power 
electronics and design ISLPED '06 

Publisher: ACM Press 

Full text available: pdf(382.04 KB) Additional Information: full citation , abstract , references , index terms

We propose a low-power two-port SRAM suitable for real-time video processing. In order 
to minimize discharge power on a read bitline, a majority-logic decides if input data are 
inverted in a write cycle, so that "1"s are in the majority. In video data, since more
significant bits of adjacent pixel data are fortunately lopsided to either "0" or "1" with 
higher probability, the data bits in the pixels are reordered in each digit group to exploit 
the majority logic. The speed and area overheads are ... 

Keywords: data-bit reordering, low power SRAM, majority logic, real-time image
processing, two-port SRAM 




9 Generation of permutations for SIMD processors
Alexei Kudriavtsev, Peter Kogge 

June 2005 ACM SIGPLAN Notices, Proceedings of the 2005 ACM SIGPLAN/SIGBED
conference on Languages, compilers, and tools for embedded systems 
LCTES '05, Volume 40 Issue 7 

Publisher: ACM Press 

Full text available: pdf(356.05 KB) Additional Information: full citation , abstract , references , citings, index terms

Short vector (SIMD) instructions are useful in signal processing, multimedia, and scientific
applications. They offer higher performance, lower energy consumption, and better 
resource utilization. However, compilers still do not have good support for SIMD 
instructions, and often the code has to be written manually in assembly language or using
compiler builtin functions. Also, in some applications, higher parallelism could be achieved 
if compilers inserted permutation instructions that reorder t ... 

Keywords: SIMD, permutations 



10 Implicit array bounds checking on 64-bit architectures 

Chris Bentley, Scott A. Watterson, David K. Lowenthal, Barry Rountree

December 2006 ACM Transactions on Architecture and Code Optimization (TACO), 
Volume 3 Issue 4 

Publisher: ACM Press 

Full text available: pdf(548.20 KB) Additional Information: full citation , abstract , references , index terms

Several programming languages guarantee that array subscripts are checked to ensure 
they are within the bounds of the array. While this guarantee improves the correctness 
and security of array-based code, it adds overhead to array references. This has been an 
obstacle to using higher-level languages, such as Java, for high-performance parallel 
computing, where the language specification requires that all array accesses must be 
checked to ensure they are within bounds. This is because, in practic ...

Keywords: 64-bit architectures, Array-bounds checking, virtual memory




11 Caching I: New results on web caching with request reordering
Susanne Albers

June 2004 Proceedings of the sixteenth annual ACM symposium on Parallelism in
algorithms and architectures SPAA '04

Publisher: ACM Press

Full text available: pdf(186.52 KB) Additional Information: full citation , abstract , references , index terms

We study web caching with request reordering. The goal is to maintain a cache of web 
documents so that a sequence of requests can be served at low cost. To improve cache 
hit rates, a limited reordering of requests is allowed. Feder et al. [6], who recently 
introduced this problem, considered caches of size 1, i.e. a cache can store one 
document. They presented an offline algorithm based on dynamic programming as well as 
online algorithms that achieve constant factor competitive ratios. For arbit ... 

Keywords: approximation, batch, cache, competitive, document, offline, online 

12 Using Rewriting Rules and Positive Equality to Formally Verify Wide-Issue Out-of-
Order Microprocessors with a Reorder Buffer 
M. Velev 

March 2002 Proceedings of the conference on Design, automation and test in Europe 
DATE '02 

Publisher: IEEE Computer Society 

Full text available: pdf(192.83 KB) Additional Information: full citation , abstract , citings

Rewriting rules and Positive Equality [4] are combined in an automatic way in order to
formally verify out-of-order processors that have a Reorder Buffer, and can issue/retire
multiple instructions per clock cycle. Only register-register instructions are implemented,
and can be executed out-of-order, as soon as their data operands can be either read from
the Register File, or forwarded as results of instructions ahead in program order in the
Reorder Buffer. The verification is based on the Burch and Di ...

13 Packet reordering is not pathological network behavior 
Jon C. R. Bennett, Craig Partridge, Nicholas Shectman 

December 1999 IEEE/ACM Transactions on Networking (TON), volume 7 issue 6 
Publisher: IEEE Press 

Full text available: pdf(107.65 KB) Additional Information: full citation, references, citings, index terms 



Keywords: Internet, communication system traffic, packet switching 



14 Scan-BIST based on cluster analysis and the encoding of repeating sequences 
Lei Li, Zhanglei Wang, Krishnendu Chakrabarty 

January 2007 ACM Transactions on Design Automation of Electronic Systems 
(TODAES), Volume 12 Issue 1 
Publisher: ACM Press 

Full text available: pdf(523.07 KB) Additional Information: full citation, abstract, references, index terms 

We present a built-in self-test (BIST) approach for full-scan designs that extracts the 
most frequently occurring sequences from deterministic test patterns. The extracted 
sequences are stored on-chip, and are used during test application. Three sets of test 
patterns are applied to the circuit under test during a BIST test session; these include 
pseudorandom patterns, semirandom patterns, and deterministic patterns. The 
semirandom patterns are generated based on the stored sequences and they are ... 

Keywords: Built-in self-test (BIST), clustering test data volume, test compression 



15 Code optimization - 1: Optimization opportunities created by global data reordering 
Gadi Haber, Moshe Klausner, Vadim Eisenberg, Bilha Mendelson, Maxim Gurevich 
March 2003 Proceedings of the international symposium on Code generation and 
optimization: feedback-directed and runtime optimization CGO '03 

Publisher: IEEE Computer Society 

Full text available: pdf(1.31 MB) Additional Information: full citation, abstract, references, citings, index terms 

Memory access has proven to be one of the bottlenecks in modern architectures. 
Improving memory locality and eliminating the amount of memory access can help 
release this bottleneck. We present a method for link-time profile-based optimization by 
reordering the global data of the program and modifying its code accordingly. The 
proposed optimization reorders the entire global data of the program, according to a 
representative execution rate of each instruction (or basic block) in the code. The da ... 

16 WISQ: a restartable architecture using queues 

A. R. Pleszkun, J. R. Goodman, W. C. Hsu, R. T. Joersz, G. Bier, P. Woest, P. B. Schechter 
June 1987 Proceedings of the 14th annual international symposium on Computer 
architecture ISCA '87 

Publisher: ACM Press 

Full text available: pdf(1.14 MB) Additional Information: full citation, abstract, references, citings, index terms 

In this paper, the WISQ architecture is described. This architecture is designed to achieve 
high performance by exploiting new compiler technology and using a highly segmented 
pipeline. By having a highly segmented pipeline, a very-high-speed clock can be used. 
Since a highly segmented pipeline will require relatively long pipelines, a way must be 
provided to minimize the effects of pipeline bubbles that are formed due to data and 
control dependencies. It is also important to provide a way ... 

17 Temperature and power aware architectures: Reducing reorder buffer complexity 
through selective operand caching 

Gurhan Kucuk, Dmitry Ponomarev, Oguz Ergin, Kanad Ghose 

August 2003 Proceedings of the 2003 international symposium on Low power 
electronics and design ISLPED '03 

Publisher: ACM Press 



Full text available: pdf(80.27 KB) Additional Information: full citation, abstract, references, citings, index terms 



Modern superscalar processors implement precise interrupts by using the Reorder Buffer 
(ROB). In some microarchitectures, such as the Intel P6, the ROB also serves as a 
repository for the uncommitted results. In these designs, the ROB is a complex multi- 
ported structure that dissipates a significant percentage of the overall chip power. 
Recently, a mechanism was introduced for reducing the ROB complexity and its power 
dissipation through the complete elimination of read ports for reading out so ... 

Keywords: low-complexity datapath, low-power design, reorder buffer, short-lived 
values 



18 Compilers: Implicit java array bounds checking on 64-bit architecture 
Chris Bentley, Scott A. Watterson, David K. Lowenthal, Barry Rountree 

June 2004 Proceedings of the 18th annual international conference on 
Supercomputing ICS '04 

Publisher: ACM Press 

Full text available: pdf(188.75 KB) Additional Information: full citation, abstract, references, index terms 

Interest in using Java for high-performance parallel computing has increased in recent 
years. One obstacle that has inhibited Java from widespread acceptance in the scientific 
community is the language requirement that all array accesses must be checked to 
ensure they are within bounds. In practice, array bounds checking in scientific 
applications may increase execution time by more than a factor of 2. Previous research 
has explored optimizations to statically eliminate bounds checks, but the dy ... 



Keywords: array-bounds checking, java, virtual memory 



19 Co-synthesis of pipelined structures and instruction reordering constraints for 
instruction set processors 
Ing-Jer Huang 

January 2001 ACM Transactions on Design Automation of Electronic Systems 
(TODAES), Volume 6 Issue 1 
Publisher: ACM Press 

Full text available: pdf(1.58 MB) Additional Information: full citation, abstract, references, index terms 

This paper presents a hardware/software co-synthesis approach to pipelined ISP 
(instruction set processor) design. The approach synthesizes the pipeline structure from a 
given instruction set architecture (behavioral) specification. In addition, it generates a set 
of reordering constraints that guides the compiler back-end (reorderer) to properly 
schedule instructions so that possible pipeline hazards are avoided and throughput is 
improved. Co-synthesis takes place while resolving ... 

Keywords: compiler instruction optimization, instruction set processor, pipeline hazards, 
pipeline taxonomy, synthesis 
Hamidreza Hashempour, Fabrizio Lombardi 

April 2005 Proceedings of the 15th ACM Great Lakes symposium on VLSI GLSVLSI '05 
Publisher: ACM Press 

Full text available: pdf(121.41 KB) Additional Information: full citation, abstract, references, index terms 

This paper presents a novel approach for compressing functional test data in Automatic 
Test Equipment (ATE). A practical technique is presented for 2 Dimensional (2D) 
reordering of test data in which additionally to test vector reordering, column reordering 
is also applied. An ATE based approach to extract the original test vectors from the 2D 
ordered data is presented. The advantage of the approach is substantiated using the 
figure of merit of entropy for the 2D ordered test data of ISCAS bench ... 

Keywords: 2D reordering, ATE, column reordering, functional test data, scan test data, 
test data compression 